{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Implementation of neural network\n",
    "## Introduction\n",
    "In this task, we use NumPy to implement a basic neural network.\n",
    "1. We build a simple neuron.\n",
    "2. We build a simple neural network.\n",
    "3. We build a simple neural network with backward.\n",
    "4. We train & evaluate the simple neural network.\n",
    "\n",
    "All operations are run on the CPU."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Import packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build a simple neuron\n",
    "<img src=\"Images/n.png\" width=\"25%\">\n",
    "Neuron is a basic unit of a neural network. The figure above shows a simple neuron, which can be written as follows:\n",
    "\\begin{equation}\n",
    "y = \\sigma(x_1w_1 + x_2w_2 + b).\n",
    "\\label{eq:neuron}\n",
    "\\end{equation}\n",
    "Here, $[x_1, x_2]$ is the input vector. $[w_1, w_2]$ and $b$ are the weights and bias of the neuron, respectively. $\\sigma(\\cdot)$ is a sigmoid function, which is the activation function of the neuron."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.9525741268224334\n"
     ]
    }
   ],
   "source": [
    "def sigmoid(x):\n",
    "    # Our activation function: f(x) = 1 / (1 + e^(-x))\n",
    "    return 1 / (1 + np.exp(-x))\n",
    "\n",
    "class Neuron:\n",
    "    def __init__(self, w, b, name=''):\n",
    "        self.name = name\n",
    "        self.w = w\n",
    "        self.b = b\n",
    "    \n",
    "    def forward(self, x):\n",
    "        # Weight inputs, add bias, then use the activation function\n",
    "        t = np.dot(self.w, x) + self.b\n",
    "        return sigmoid(t)\n",
    "\n",
    "init_w = np.array([0, 1])    # w1 = 0, w2 = 1\n",
    "init_b = 0    # b = 0\n",
    "n = Neuron(init_w, init_b)\n",
    "\n",
    "x = np.array([2, 3])    # x1 = 2, x2 = 3\n",
    "print(n.forward(x))    # 0.9525741268224334"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build a simple neural network\n",
    "<img src=\"Images/nn.png\" width=\"35%\">\n",
    "The figure above shows a simple neural network, which contains three neurons."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.7216325609518421\n"
     ]
    }
   ],
   "source": [
    "class NeuralNetwork:\n",
    "    '''\n",
    "    A neural network with:\n",
    "    - 2 inputs\n",
    "    - a hidden layer with 2 neurons (h1, h2)\n",
    "    - an output layer with 1 neuron (o1)\n",
    "    Each neuron has the same weights and bias:\n",
    "    - w = [0, 1]\n",
    "    - b = 0\n",
    "    '''\n",
    "    def __init__(self):\n",
    "        init_w = np.array([0, 1])\n",
    "        init_b = 0\n",
    "\n",
    "        # The Neuron class here is from the previous section\n",
    "        self.h1 = Neuron(init_w, init_b, 'h1')\n",
    "        self.h2 = Neuron(init_w, init_b, 'h2')\n",
    "        self.o1 = Neuron(init_w, init_b, 'o1')\n",
    "\n",
    "    def forward(self, x):\n",
    "        out_h1 = self.h1.forward(x)\n",
    "        out_h2 = self.h2.forward(x)\n",
    "\n",
    "        # The inputs for o1 are the outputs from h1 and h2\n",
    "        out_o1 = self.o1.forward(np.array([out_h1, out_h2]))\n",
    "\n",
    "        return out_o1\n",
    "\n",
    "network = NeuralNetwork()\n",
    "x = np.array([2, 3])\n",
    "print(network.forward(x))    # 0.7216325609518421"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build a simple neuron with backward\n",
    "Let's consider a a simple neuron:\n",
    "\\begin{equation}\n",
    "y = \\sigma(x_1w_1 + x_2w_2 + b).\n",
    "\\label{eq:neuron}\n",
    "\\end{equation}\n",
    "We set:\n",
    "\\begin{equation}\n",
    "t = x_1w_1 + x_2w_2 + b. \n",
    "\\end{equation}\n",
    "Then we get:\n",
    "\\begin{equation}\n",
    "y = \\sigma(t)\n",
    "\\label{eq:t}\n",
    "\\end{equation}\n",
    "Then,\n",
    "\\begin{equation}\n",
    "\\frac{\\partial y}{\\partial w_1} = \\frac{\\partial y}{\\partial t} \\times \\frac{\\partial t}{\\partial w_1} = \\sigma'(t) \\times x_1;\n",
    "\\end{equation}\n",
    "\\begin{equation}\n",
    "\\frac{\\partial y}{\\partial w_2} = \\frac{\\partial y}{\\partial t} \\times \\frac{\\partial t}{\\partial w_2} = \\sigma'(t) \\times x_2;\n",
    "\\end{equation}\n",
    "\\begin{equation}\n",
    "\\frac{\\partial y}{\\partial b} = \\frac{\\partial y}{\\partial t} \\times \\frac{\\partial t}{\\partial b} = \\sigma'(t);\n",
    "\\end{equation}\n",
    "\\begin{equation}\n",
    "\\frac{\\partial y}{\\partial x_1} = \\frac{\\partial y}{\\partial t} \\times \\frac{\\partial t}{\\partial x_1} = \\sigma'(t) \\times w_1;\n",
    "\\end{equation}\n",
    "\\begin{equation}\n",
    "\\frac{\\partial y}{\\partial x_2} = \\frac{\\partial y}{\\partial t} \\times \\frac{\\partial t}{\\partial x_2} = \\sigma'(t) \\times w_2.\n",
    "\\end{equation}\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.9525741268224334\n",
      "Before backward w: [0 1] b: 0\n",
      "After backward w: [-0.09035332  0.86447002] b: -0.045176659730912\n"
     ]
    }
   ],
   "source": [
    "def d_sigmoid(x):\n",
    "    # Derivative of sigmoid: f'(x) = f(x) * (1 - f(x))\n",
    "    fx = sigmoid(x)\n",
    "    return fx * (1 - fx)\n",
    "\n",
    "class Neuron_with_backward:\n",
    "    def __init__(self, w, b, name=''):\n",
    "        self.name = name\n",
    "        self.w = w\n",
    "        self.b = b\n",
    "        self.x = None\n",
    "        self.t = None\n",
    "        self.d_w = None\n",
    "        self.d_b = None\n",
    "    \n",
    "    def forward(self, x):\n",
    "        # Weight inputs, add bias, then use the activation function\n",
    "        t = np.dot(self.w, x) + self.b\n",
    "        self.x = x\n",
    "        self.t = t\n",
    "        return sigmoid(t)\n",
    "    \n",
    "    def backward(self, d_back):\n",
    "        # d_back is the gradient from the back neurons.\n",
    "        d_y_d_t = d_sigmoid(self.t)\n",
    "        d_t_d_w = self.x\n",
    "        d_t_d_x = self.w\n",
    "        self.d_w = d_y_d_t * d_t_d_w \n",
    "        self.d_b = d_y_d_t\n",
    "        self.d_x = d_y_d_t * d_t_d_x\n",
    "        # SGD update\n",
    "        self.w = self.w - self.d_w * d_back\n",
    "        self.b = self.b - self.d_b * d_back\n",
    "        return self.d_x * d_back\n",
    "        \n",
    "        \n",
    "init_w = np.array([0, 1])    # w1 = 0, w2 = 1\n",
    "init_b = 0    # b = 0\n",
    "n = Neuron_with_backward(init_w, init_b)\n",
    "\n",
    "x = np.array([2, 3])    # x1 = 2, x2 = 3\n",
    "print(n.forward(x))    # 0.9525741268224334\n",
    "print('Before backward w:', n.w, 'b:', n.b)\n",
    "n.backward(1)\n",
    "print('After backward w:', n.w, 'b:', n.b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build a simple neural network with backward\n",
    "<img src=\"Images/back.png\" width=\"30%\">\n",
    "The figure above shows consider the backward of $h_1$.\\\n",
    "The detailed process of backward can be found in the \"backward\" function in \"NeuralNetwork_with_backward\"."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.7216325609518421\n",
      "Before backward.\n",
      "h1.w: [0 1] h1.b: 0\n",
      "h2.w: [0 1] h2.b: 0\n",
      "o1.w: [0 1] o1.b: 0\n",
      "After backward.\n",
      "h1.w: [0. 1.] h1.b: 0.0\n",
      "h2.w: [-0.01815009  0.97277487] h2.b: -0.009075042588152822\n",
      "o1.w: [-0.19135215  0.80864785] o1.b: -0.20087900792592797\n"
     ]
    }
   ],
   "source": [
    "class NeuralNetwork_with_backward:\n",
    "    '''\n",
    "    A neural network with:\n",
    "    - 2 inputs\n",
    "    - a hidden layer with 2 neurons (h1, h2)\n",
    "    - an output layer with 1 neuron (o1)\n",
    "    Each neuron has the same weights and bias:\n",
    "    - w = [0, 1]\n",
    "    - b = 0\n",
    "    '''\n",
    "    def __init__(self):\n",
    "        init_w = np.array([0, 1])\n",
    "        init_b = 0\n",
    "\n",
    "        # The Neuron class here is from the previous section\n",
    "        self.h1 = Neuron_with_backward(init_w, init_b, 'h1')\n",
    "        self.h2 = Neuron_with_backward(init_w, init_b, 'h2')\n",
    "        self.o1 = Neuron_with_backward(init_w, init_b, 'o1')\n",
    "\n",
    "    def forward(self, x):\n",
    "        out_h1 = self.h1.forward(x)\n",
    "        out_h2 = self.h2.forward(x)\n",
    "\n",
    "        # The inputs for o1 are the outputs from h1 and h2\n",
    "        out_o1 = self.o1.forward(np.array([out_h1, out_h2]))\n",
    "\n",
    "        return out_o1\n",
    "    \n",
    "    def backward(self, d_back):\n",
    "        # d_back is the gradient from the back loss function.\n",
    "        d_outo1_d_outh1, d_outo1_d_outh2 = self.o1.backward(d_back)\n",
    "        self.h1.backward(d_outo1_d_outh1)\n",
    "        self.h2.backward(d_outo1_d_outh2)\n",
    "\n",
    "nn = NeuralNetwork_with_backward()\n",
    "x = np.array([2, 3])\n",
    "print(nn.forward(x))    # 0.7216325609518421\n",
    "print('Before backward.')\n",
    "print('h1.w:', nn.h1.w, 'h1.b:', nn.h1.b)\n",
    "print('h2.w:', nn.h2.w, 'h2.b:', nn.h2.b)\n",
    "print('o1.w:', nn.o1.w, 'o1.b:', nn.o1.b)\n",
    "nn.backward(1)\n",
    "print('After backward.')\n",
    "print('h1.w:', nn.h1.w, 'h1.b:', nn.h1.b)\n",
    "print('h2.w:', nn.h2.w, 'h2.b:', nn.h2.b)\n",
    "print('o1.w:', nn.o1.w, 'o1.b:', nn.o1.b)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Build mean squared error (mse) loss"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0.5\n",
      "2\n"
     ]
    }
   ],
   "source": [
    "def mse_loss(label, pred):\n",
    "    return (label - pred) ** 2\n",
    "\n",
    "def d_mse_loss(label, pred):\n",
    "    # Derivative of MSE Loss\n",
    "    return -2 * (label - pred)\n",
    "\n",
    "y_true = np.array([1, 0, 0, 1])\n",
    "y_pred = np.array([0, 0, 0, 0])\n",
    "\n",
    "print(mse_loss(y_true, y_pred).mean()) # 0.5\n",
    "print(d_mse_loss(0, 1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Prepare training set\n",
    "Name | Weight | Height | Gender\n",
    "- | :-: | :-: | :-\n",
    "Alice | 133 | 65 | F\n",
    "Bob | 160 | 72 | M\n",
    "Charlie | 152 | 70 | M\n",
    "Diana | 120 | 60 | F\n",
    "\n",
    "The table above is the training set.\\\n",
    "Let's train our network to predict someone’s gender given his/her weight and height:\n",
    "<img src=\"Images/app.png\" width=\"50%\">\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [],
   "source": [
    "data = np.array([\n",
    "  [133, 65],    # Alice\n",
    "  [160, 72],    # Bob\n",
    "  [152, 70],    # Charlie\n",
    "  [120, 60],    # Diana\n",
    "])\n",
    "labels = np.array([\n",
    "  1,    # Alice\n",
    "  0,    # Bob\n",
    "  0,    # Charlie\n",
    "  1,    # Diana\n",
    "])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Data Normalization"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "data_mean = np.mean(data, 0)\n",
    "data_std = np.std(data, 0)\n",
    "def norm(data):\n",
    "    return (data - data_mean) / data_std\n",
    "\n",
    "data = norm(data)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## SGD\n",
    "Optimization algorithm called stochastic gradient descent (SGD) that tells us how to change our weights and biases to minimize loss. It’s basically just this update equation:\n",
    "\\begin{equation}\n",
    "w_1 \\leftarrow w_1 - \\eta \\frac{\\partial L}{\\partial w_1}.\n",
    "\\end{equation}\n",
    "Here, $L$ is the loss function, and $\\eta$ is a constant called the learning rate that controls how fast we train."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Train a simple neural network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch 50 loss: 0.160155\n",
      "Epoch 100 loss: 0.066590\n",
      "Epoch 150 loss: 0.033055\n",
      "Epoch 200 loss: 0.020086\n",
      "Epoch 250 loss: 0.013858\n",
      "Epoch 300 loss: 0.010362\n",
      "Epoch 350 loss: 0.008177\n",
      "Epoch 400 loss: 0.006702\n",
      "Epoch 450 loss: 0.005650\n",
      "Epoch 500 loss: 0.004866\n",
      "Epoch 550 loss: 0.004262\n",
      "Epoch 600 loss: 0.003785\n",
      "Epoch 650 loss: 0.003398\n",
      "Epoch 700 loss: 0.003079\n",
      "Epoch 750 loss: 0.002813\n",
      "Epoch 800 loss: 0.002586\n",
      "Epoch 850 loss: 0.002392\n",
      "Epoch 900 loss: 0.002224\n",
      "Epoch 950 loss: 0.002077\n",
      "Epoch 1000 loss: 0.001947\n"
     ]
    }
   ],
   "source": [
    "# build a simple neural network with backward\n",
    "nn = NeuralNetwork_with_backward()\n",
    "\n",
    "# learning rate\n",
    "learn_rate = 0.1\n",
    "\n",
    "# number of times to loop through the entire dataset\n",
    "epochs = 1000\n",
    "\n",
    "for epoch in range(epochs):\n",
    "    for x, label in zip(data, labels):\n",
    "        pred = nn.forward(x)\n",
    "        d_L_d_ypred = d_mse_loss(label, pred)\n",
    "        nn.backward(learn_rate * d_L_d_ypred)\n",
    "    \n",
    "    if (epoch + 1) % 50 == 0:\n",
    "        y_preds = np.apply_along_axis(nn.forward, 1, data)\n",
    "        loss = mse_loss(labels, y_preds).mean()\n",
    "        print(\"Epoch %d loss: %.6f\" % (epoch + 1, loss))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Evaluate a simple neural network"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Emily: 0.968267\n",
      "Frank: 0.070288\n"
     ]
    }
   ],
   "source": [
    "emily = norm(np.array([[128, 63]]))    # 128 pounds, 63 inches\n",
    "frank = norm(np.array([[155, 68]]))    # 155 pounds, 68 inches\n",
    "\n",
    "print(\"Emily: %.6f\" % nn.forward(emily[0])) # 0.968267 - F\n",
    "print(\"Frank: %.6f\" % nn.forward(frank[0])) # 0.070288 - M"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
